Variable selection and estimation in generalized linear models with the seamless L0 penalty.

نویسندگان

  • Zilin Li
  • Sijian Wang
  • Xihong Lin
چکیده

In this paper, we propose variable selection and estimation in generalized linear models using the seamless L0 (SELO) penalized likelihood approach. The SELO penalty is a smooth function that very closely resembles the discontinuous L0 penalty. We develop an e cient algorithm to fit the model, and show that the SELO-GLM procedure has the oracle property in the presence of a diverging number of variables. We propose a Bayesian Information Criterion (BIC) to select the tuning parameter. We show that under some regularity conditions, the proposed SELO-GLM/BIC procedure consistently selects the true model. We perform simulation studies to evaluate the finite sample performance of the proposed methods. Our simulation studies show that the proposed SELO-GLM procedure has a better finite sample performance than several existing methods, especially when the number of variables is large and the signals are weak. We apply the SELO-GLM to analyze a breast cancer genetic dataset to identify the SNPs that are associated with breast cancer risk.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Florida State University College of Arts and Sciences Theories on Group Variable Selection in Multivariate Regression Models

We study group variable selection on multivariate regression model. Group variable selection is selecting the non-zero rows of coefficient matrix, since there are multiple response variables and thus if one predictor is irrelevant to estimation then the corresponding row must be zero. In a high dimensional setup, shrinkage estimation methods are applicable and guarantee smaller MSE than OLS acc...

متن کامل

Variable Selection and Estimation with the Seamless-l0 Penalty

Penalized least squares procedures that directly penalize the number of variables in a regression model (L0 penalized least squares procedures) enjoy nice theoretical properties and are intuitively appealing. On the other hand, L0 penalized least squares methods also have significant drawbacks in that implementation is NP-hard and computationally unfeasible when the number of variables is even ...

متن کامل

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron, et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman, et al. (2007) for penalized linear regression and penalized logistic regression and was shown to gain computational superiority. This paper explores...

متن کامل

A Majorization-minimization Approach to Variable Selection Using Spike and Slab Priors

We develop a method to carry out MAP estimation for a class of Bayesian regression models in which coefficients are assigned with Gaussian-based spike and slab priors. The objective function in the corresponding optimization problem has a Lagrangian form in that regression coefficients are regularized by a mixture of squared l2 and l0 norms. A tight approximation to the l0 norm using majorizati...

متن کامل

Variable Selection via Penalized Likelihood

Variable selection is vital to statistical data analyses. Many of procedures in use are ad hoc stepwise selection procedures, which are computationally expensive and ignore stochastic errors in the variable selection process of previous steps. An automatic and simultaneous variable selection procedure can be obtained by using a penalized likelihood method. In traditional linear models, the best...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • The Canadian journal of statistics = Revue canadienne de statistique

دوره 40 4  شماره 

صفحات  -

تاریخ انتشار 2012